CLIR Using Web Directory at NTCIR-4

Authors

  • Fuminori Kimura
  • Akira Maeda
  • Shunsuke Uemura
Abstract

In this paper, we propose a CLIR method that employs a Web directory provided in multiple language versions (such as Yahoo!). In the proposed method, feature terms are first extracted from the Web documents in each category, in both the source and the target languages. Before retrieval, category matching is conducted to identify pairs of corresponding categories across the two languages. Using these category pairs, we aim to resolve the ambiguities of simple dictionary translation by narrowing the categories to be retrieved in the target language. At NTCIR-4, we participated in the Japanese-English cross-language track and submitted a TITLE run and a DESCRIPTION run. Our analysis of the experimental results shows that failures in translating proper nouns seriously degrade retrieval results.
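
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the overlap-count scoring, the function names (`match_categories`, `pick_source_category`, `translate_query`), and the toy categories, feature terms, and dictionary entries are all assumptions made for illustration. The sketch pairs categories across languages by comparing translated feature terms, picks the source category closest to the query, and keeps only the dictionary translations that occur among the feature terms of the paired target category.

```python
from typing import Dict, List, Set

Features = Dict[str, Set[str]]      # category name -> feature terms
Dictionary = Dict[str, List[str]]   # source term -> candidate translations

# Toy data, invented for illustration only.
SRC_FEATURES: Features = {
    "コンピュータ": {"ドライバー", "ソフトウェア"},
    "工具": {"ドライバー", "ネジ"},
}
TGT_FEATURES: Features = {
    "Computers": {"driver", "software"},
    "Tools": {"screwdriver", "screw"},
}
DICTIONARY: Dictionary = {
    "ドライバー": ["driver", "screwdriver"],  # ambiguous term
    "ソフトウェア": ["software"],
    "ネジ": ["screw"],
}


def match_categories(src: Features, tgt: Features,
                     dictionary: Dictionary) -> Dict[str, str]:
    """Pair each source category with the target category that shares
    the most translated feature terms (simple overlap count, an assumption)."""
    pairs = {}
    for s_cat, s_terms in src.items():
        translated = {t for term in s_terms for t in dictionary.get(term, [])}
        pairs[s_cat] = max(tgt, key=lambda t_cat: len(translated & tgt[t_cat]))
    return pairs


def pick_source_category(query: List[str], src: Features) -> str:
    """Choose the source category whose feature terms overlap the query most."""
    return max(src, key=lambda cat: len(set(query) & src[cat]))


def translate_query(query: List[str], src: Features, tgt: Features,
                    dictionary: Dictionary) -> List[str]:
    """Translate a query, keeping only candidates found among the feature
    terms of the paired target category; fall back to all candidates."""
    pairs = match_categories(src, tgt, dictionary)
    tgt_cat = pairs[pick_source_category(query, src)]
    result = []
    for term in query:
        candidates = dictionary.get(term, [])
        kept = [c for c in candidates if c in tgt[tgt_cat]]
        result.extend(kept or candidates)
    return result


if __name__ == "__main__":
    print(translate_query(["ドライバー", "ネジ"],
                          SRC_FEATURES, TGT_FEATURES, DICTIONARY))
```

In this toy setting the ambiguous term is resolved by the category context of the query. A term missing from the dictionary contributes no translation at all, which is the kind of failure the abstract reports for proper nouns.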

Similar resources

RMIT Chinese-English CLIR at NTCIR-4

We participated in the Chinese-English CLIR task, concentrating primarily on the issues of translation disambiguation and automatic translation extraction of OOV terms. A new technique to identify and translate Chinese OOV terms using the web was developed. The results for this aspect of our work appear promising.

Using the Web for Translation Disambiguation: RMIT University at NTCIR-5 Chinese-English CLIR

RMIT University participated in the Chinese-English NTCIR-5 CLIR task. In previous work, researchers have relied on the test collection corpus to perform translation disambiguation in CLIR. In a production system, one would not be able to use a constrained test collection for disambiguation. Therefore we are interested in seeing how well our techniques perform when the web is used to provide cont...

Overview of CLIR Task at the Third NTCIR Workshop

This report is an overview of the Cross-Language Information Retrieval (CLIR) Task at the Third NTCIR Workshop. There are three tracks in CLIR: Single Language IR (SLIR), Bilingual CLIR (BLIR), and Multilingual CLIR (MLIR). The scope, schedule, test collections, search results, relevance judgments, scoring results, and preliminary analyses are described in the report.

NTCIR-6 CLIR Experiments at Osaka Kyoiku University - Term Expansion Using Online Dictionaries and Weighting Score by Term Variety

This paper describes experimental results for the J-J subtask of NTCIR-6 CLIR. We expanded query terms using online dictionaries on the Web. This was effective for some topics whose average precision was low. A probabilistic model was employed for scoring, and we also modified the score by multiplying it by the number of varieties of query terms. In most cases this works well. Query term reduction should b...

ISCAS in English-Chinese CLIR at NTCIR-5

We participated in the Chinese single language information retrieval (SLIR) C-C task and the English-Chinese cross-language information retrieval (CLIR) E-C task at NTCIR-5. Our project concentrates on two aspects of CLIR research: 1) We test various IR models, especially language models, for Chinese SLIR using the training corpus provided by the NTCIR organizer, and different smoothing methods ...

Publication date: 2004